Return-Path: <icon-group-sender>
Received: from kingfisher.CS.Arizona.EDU (kingfisher.CS.Arizona.EDU [192.12.69.239])
by baskerville.CS.Arizona.EDU (8.8.7/8.8.7) with SMTP id QAA08003
for <icon-group-addresses@baskerville.CS.Arizona.EDU>; Fri, 13 Mar 1998 16:34:07 -0700 (MST)
Received: by kingfisher.CS.Arizona.EDU (5.65v4.0/1.1.8.2/08Nov94-0446PM)
id AA17167; Fri, 13 Mar 1998 16:34:06 -0700
Message-Id: <3509AE03.729A@gte.net>
Date: Fri, 13 Mar 1998 16:06:59 -0600
From: Mark Evans <evans@gte.net>
Reply-To: evans@gte.net
Organization: None
X-Mailer: Mozilla 3.01 (Win95; I)
Mime-Version: 1.0
To: icon-group@optima.CS.Arizona.EDU
Subject: Re: Letter Probabilities
References: <199803131730.LAA18482@axp.cmpu.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO
Content-Length: 3475
To the group -
Several people have simultaneously suggested the generator string idea.
This was the first idea that came to mind when I faced the problem
originally. See my answer to eka@corp.cirrus.com (Eka Laiman) for my
comments.
The probability table is simply a required output. Since I have to
compute it anyway, I may as well use it. When you generate random
text without computing it, there is no way to tell from the output what
the relative probabilities are except by very gross estimation. The
table tells you at a glance, top to bottom.
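
For illustration, a table like the one described above could be computed
roughly as follows. This is a minimal sketch in Python rather than Icon;
the function name frequency_table and the sample string are assumptions
for the example and are not taken from the original program.

```python
from collections import Counter


def frequency_table(text):
    """Return (character, relative frequency) pairs, most frequent first."""
    counts = Counter(text)
    total = sum(counts.values())
    # most_common() orders by descending count, so the result reads
    # top to bottom from the likeliest character to the rarest.
    return [(ch, n / total) for ch, n in counts.most_common()]


if __name__ == "__main__":
    sample = "call me ishmael"  # placeholder input, not the original sample
    for ch, p in frequency_table(sample):
        print(f"{ch!r}\t{p}")
```

Sorting by descending count is what makes the table readable at a glance;
the relative frequencies always sum to 1 up to floating-point round-off.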
In English, the space character is always first, followed by lower case
'e' with probability about 0.10. Some results are counterintuitive,
such as 'y' occurring over 50% more often than 'b' in the sample below
(computed from a small portion of "Moby Dick").
I have only been at Icon for a few weeks, but I think I have a firm
grasp of it. I don't have any "C mentality" problems. If I did, then I would
not have bothered asking the group if there were a more elegant Icon
method. I would certainly not have asked for an Icon-->C converter!
I've used a number of different languages and know how to adapt.
Actually my little program has grown into a moderately complicated Icon
case study. I've bumped against the 32K limit, that's for sure. It has
buttons, menus, all kinds of things going on.
No one has really answered my original question about the inner while
loop. Whether it is ideal for this problem or not, I would like to know
whether Icon has some elegant mechanism for scanning such an ordered
list.
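
The original inner loop is not shown in this message, so the following is
only a guess at the kind of scan in question: picking a character by
walking an ordered (character, probability) list until a running total
passes a random number. It is a sketch in Python, not Icon, and the names
pick_linear and make_picker as well as the three-entry table are
assumptions made for the example.

```python
import bisect
import itertools
import random


def pick_linear(table, r):
    """Walk the ordered list until the running total exceeds r (0 <= r < 1)."""
    running = 0.0
    for ch, p in table:
        running += p
        if r < running:
            return ch
    return table[-1][0]  # guard against floating-point round-off


def make_picker(table):
    """Precompute cumulative sums once, then binary-search them per draw."""
    chars = [ch for ch, _ in table]
    cumulative = list(itertools.accumulate(p for _, p in table))

    def pick(r):
        # Index of the first cumulative sum strictly greater than r.
        i = bisect.bisect_right(cumulative, r)
        return chars[min(i, len(chars) - 1)]

    return pick


if __name__ == "__main__":
    table = [(" ", 0.5), ("e", 0.3), ("t", 0.2)]  # made-up probabilities
    pick = make_picker(table)
    print("".join(pick(random.random()) for _ in range(40)))
```

With the table sorted from most to least probable, the linear walk usually
stops within the first few entries, so the binary search mainly pays off
for large alphabets.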
I will append a sample table for everyone's curiosity.
Mark
__________________________________________________
[letter frequencies]
" "<--->0.1751922190691018
"e"<--->0.09672803124014646
"t"<--->0.07254602343010987
"o"<--->0.06209221664362462
"a"<--->0.06182541415023404
"s"<--->0.05256009119794318
"n"<--->0.05175968371777146
"i"<--->0.0484610347085789
"h"<--->0.04632661476145431
"r"<--->0.04501685706662785
"l"<--->0.0317980062577312
"d"<--->0.03043973901865191
"u"<--->0.02127143515486672
"m"<--->0.01979189405515535
"g"<--->0.01763321933590433
"c"<--->0.01717237866550243
"f"<--->0.01707535957699677
"w"<--->0.01554730893303257
"y"<--->0.01554730893303257
"p"<--->0.0151349778068835
","<--->0.0151349778068835
"\n"<--->0.010089985204589
"b"<--->0.009968711343956922
"v"<--->0.007397705498556839
"."<--->0.006306240752868125
"-"<--->0.00616071212010963
"k"<--->0.005821145310339808
"I"<--->0.004826699653156758
";"<--->0.001843362681607606
"T"<--->0.001503795871837784
"?"<--->0.00140677678333212
"B"<--->0.001309757694826457
"W"<--->0.001309757694826457
"S"<--->0.001091464745688714
"N"<--->0.001042955201435882
"A"<--->0.0009701908850566347
"C"<--->0.000921681340803803
"x"<--->0.0008974265686773871
"z"<--->0.0008246622522981394
"j"<--->0.0007033883916660602
"q"<--->0.0006548788474132285
"!"<--->0.0006306240752868126
"P"<--->0.0006306240752868126
"'"<--->0.0006063693031603967
"H"<--->0.0005093502146547332
"F"<--->0.0004850954425283173
"L"<--->0.0004365858982754856
"M"<--->0.0004123311261490697
"D"<--->0.000363821581896238
"E"<--->0.0003395668097698222
"G"<--->0.0003153120376434063
"R"<--->0.0002910572655169904
"Y"<--->0.0001940381770113269
"O"<--->0.0001455286327584952
")"<--->9.701908850566347e-5
":"<--->9.701908850566347e-5
"("<--->9.701908850566347e-5
"J"<--->9.701908850566347e-5
"V"<--->4.850954425283173e-5
"U"<--->4.850954425283173e-5
"Q"<--->4.850954425283173e-5